21st ECMI Modelling Week Final Report, Team 2: Data Mining in an Online Encyclopedia
Abstract
We have surveyed and implemented several mathematical models for finding the pages most relevant to a query in an online encyclopedia. We use concepts from webpage ranking to rank the topics and thereby identify the most important ones. The pages of an online encyclopedia are hyperlinked; this link structure is represented by a directed graph and a corresponding adjacency matrix. Ranking the relevant topics turns out to be an eigenvector problem. We show the existence and uniqueness of its solution and solve it with both the power method and the linear system approach. We also investigate how α, a parameter used to form the well-known Google matrix from the adjacency matrix of the directed graph, affects the convergence rate of the power method. We have implemented the three models and obtained topic rankings from each.
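As a rough illustration of the power method and the role of α, the sketch below may help. It is not the report's implementation: it assumes the commonly used form G = αS + (1 − α)/n · 11ᵀ, where S is the column-stochastic link matrix, and it assumes that dangling pages (pages with no outgoing links) jump uniformly to every page. The four-page graph and the names `google_matrix` and `power_method` are hypothetical, chosen only for this example.

```python
def google_matrix(links, n, alpha):
    """Build a column-stochastic Google matrix as a list of rows.

    Assumed form: G = alpha * S + (1 - alpha) / n * ones, where column j
    of S spreads page j's weight evenly over the pages it links to.
    """
    G = [[(1.0 - alpha) / n for _ in range(n)] for _ in range(n)]
    for j in range(n):
        targets = links.get(j, [])
        if targets:
            w = alpha / len(targets)
            for i in targets:
                G[i][j] += w
        else:
            # Assumption: a dangling page jumps uniformly to every page.
            for i in range(n):
                G[i][j] += alpha / n
    return G


def power_method(G, tol=1e-10, max_iter=1000):
    """Iterate x <- G x until the 1-norm change drops below tol.

    Returns the rank vector (entries sum to 1) and the iteration count.
    """
    n = len(G)
    x = [1.0 / n] * n
    for k in range(1, max_iter + 1):
        y = [sum(G[i][j] * x[j] for j in range(n)) for i in range(n)]
        s = sum(y)
        y = [v / s for v in y]  # renormalise to guard against rounding drift
        if sum(abs(a - b) for a, b in zip(x, y)) < tol:
            return y, k
        x = y
    return x, max_iter


# Hypothetical link graph: page index -> indices of the pages it links to.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}

for alpha in (0.5, 0.85, 0.99):
    rank, iters = power_method(google_matrix(links, 4, alpha))
    print(alpha, iters, [round(r, 4) for r in rank])
```

The printed iteration counts give a feel for how the choice of α changes the number of power-method steps needed on this toy graph. The linear system approach mentioned in the abstract would instead solve (I − αS)x = (1 − α)/n · 1 directly, under the same assumed form of G.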
Similar resources
Bayesian Modelling for Machine Learning
Learning algorithms are central to pattern recognition, artificial intelligence, machine learning, data mining, and statistical learning. The term often implies analysis of large and complex data sets with minimal human intervention. Bayesian learning has been variously described as a method of updating opinion based on new experience, updating parameters of a process model based on data, model...
Calculation of One-dimensional Forward Modelling of Helicopter-borne Electromagnetic Data and a Sensitivity Matrix Using Fast Hankel Transforms
The helicopter-borne electromagnetic (HEM) frequency-domain exploration method is an airborne electromagnetic (AEM) technique that is widely used for vast and rough areas for resistivity imaging. The vast amount of digitized data flowing from the HEM method requires an efficient and accurate inversion algorithm. Generally, the inverse modelling of HEM data in the first step requires a precise a...
Modelling Customer Attraction Prediction in Customer Relation Management using Decision Tree: A Data Mining Approach
In today’s quality-based competitive world, known as the knowledge age, customer attraction is of ultimate importance. In keeping with the slogan “the customer is always right”, customer relation management is the core of an organizational strategy and plays an important role in four aspects: customer identification, customer attraction, customer retention, and customer satisfaction. Commercial organiza...
Customer lifetime value model in an online toy store
Businesses all around the world use different approaches to know their customers, segment them, and formulate suitable strategies for them. One of these approaches is calculating the value of each customer to the company. In this paper, by calculating Customer Lifetime Value (CLV) for individual customers of an online toy store named Alakdolak, three customer segments are extracted. The level of ...
Database Queries, Data Mining, and OLAP
Modern, commercially available relational database systems now routinely include a cadre of data retrieval and analysis tools. Here we shed some light on the interrelationships between the most common tools and components included in today’s database systems: query language engines, data mining components, and online analytical processing (OLAP) tools. We do so by pairwise juxtaposition, which ...
Publication date: 2007